TensorBoard Jupyter Notebook Helpers


In [5]:
# TensorBoard Helper Functions and Constants

# Directory to export TensorBoard summary statistics, graph data, etc.
TB_DIR = '/tmp/tensorboard/tf_basics'


def _start_tb(d):
    """
    Private function that calls `tensorboard` shell command
    
    args:
      d: The directory for TensorBoard to read log data from
    """
    !tensorboard --port=6006 --logdir=$d

def start_tensorboard(d=TB_DIR):
    """
    Starts TensorBoard from the notebook in a separate thread.
    Prevents Jupyter Notebook from halting while TensorBoard runs.
    """
    import threading
    threading.Thread(target=_start_tb, args=(d,)).start()
    del threading

def stop_tensorboard():
    """
    Kills all TensorBoard processes
    """
    !ps -aef | grep "tensorboard" | grep -v grep | tr -s ' ' | cut -d ' ' -f2 | xargs kill -KILL
    
def reset_tensorboard():
    stop_tensorboard()
    start_tensorboard()

TensorFlow Fundamentals


In [6]:
# Import core TensorFlow libraries
import tensorflow as tf
import numpy as np

My First TensorFlow Graph

Step 1: Define the graph

  • Nodes represent some sort of computation
  • Edges represent data transfer from one computation to the next

In [7]:
# `tf.placeholder` creates an "input" node; we will give it a value when we run our model
a = tf.placeholder(tf.int32, name="input_a")
b = tf.placeholder(tf.int32, name="input_b")

In [8]:
# `tf.add` creates an addition node
c = tf.add(a, b, name="add")

# `tf.mul` creates a multiplication node
d = tf.mul(a, b, name="multiply")

In [9]:
# Add up the results of the previous two nodes
out = tf.add(c, d, name="output")

In [10]:
# OPTIONAL
# Create a scalar summary, which will log the value we tell it to when executed
# In this case, we'll tell it to save our output value from `out`
# This works in tandem with our SummaryWriter below
# To create the summary, we pass in two parameters:
# 1. A 'tag', which gives a label to the data
# 2. The value(s) we'd like to save
# We also give a `name` to the summary itself (does not affect behavior)
out_summary = tf.scalar_summary("output", out, name="output_summary")

Step 2: Run the graph

  • Start a tf.Session to launch the graph
  • Set up any necessary input values
  • (Recommended) Use a tf.train.SummaryWriter to write information for TensorBoard
  • Use Session.run() to compute values from the graph

In [11]:
# Start a session
sess = tf.Session()

In [12]:
# Create a "feed_dict" dictionary to define input values
# Keys to dictionary are handles to our placeholders
# Values to dictionary are values we'd like to feed in
feed_dict = { a: 4, b: 3 }

In [13]:
# OPTIONAL
# Opens a `SummaryWriter` object, which can write stats about the graph to disk
# We pass two parameters into the SummaryWriter constructor
# The first is a string specifying the directory to write to
#   (Note: `TB_DIR` was defined earlier; "TB" stands for TensorBoard)
# The second parameter is our graph, which allows us to visualize the graph later
writer = tf.train.SummaryWriter(TB_DIR, graph=sess.graph)

In [14]:
# Execute the graph using `sess.run()`, passing in two parameters:
# The first parameter, `fetches` lists which node(s) we'd like to receive as output
# The second parameter, `feed_dict`, feeds in key-value pairs 
#   to input or override the value of nodes
# In this case, we run both the output value, as well as its scalar summary
result, summary = sess.run([out, out_summary], feed_dict=feed_dict)

# Print output with fun formatting
print("(({0}*{1}) + ({0}+{1})) = ".format(feed_dict[a], feed_dict[b]) + str(result))


((4*3) + (4+3)) = 19

In [15]:
# We add the summary to our SummaryWriter, which will write it to disk:
# Normally, summaries are collected repeatedly to show statistics over time
# TensorBoard doesn't do well visualizing single points, so we fake a "global_step"
# With two points, it will draw a line
writer.add_summary(summary, global_step=0)
writer.add_summary(summary, global_step=100)

In [16]:
# Use SummaryWriter.flush() to write all previously added summaries to disk
# This will also flush the list of summaries so that none are added twice
writer.flush()

In [17]:
# We're done! Close down our Session and SummaryWriter to tidy up.
# Note that SummaryWriter.close() automatically calls flush(), so any summaries left will be written to disk
sess.close()
writer.close()

Step 3ish: Use TensorBoard for Visualization


In [ ]:
# Start TensorBoard
start_tensorboard()

Go to your server's IP at port 6006 (replace 1.2.3.4 with your server's IP):

http://1.2.3.4:6006

Note that start_tensorboard() is a convenience function defined above. Normally, one would start TensorBoard in a terminal with a command like this (assuming TensorFlow was installed with pip):

$ tensorboard --logdir=/path/to/SummaryWriter/dir

Explore TensorBoard!


In [18]:
# Once you are done, stop TensorBoard
stop_tensorboard()

Here's the main code all together without as many comments in the way:

# Define inputs
a = tf.placeholder(tf.int32, name="input_a")
b = tf.placeholder(tf.int32, name="input_b")

# First "layer" of transformations
c = tf.add(a, b, name="add")
d = tf.mul(a, b, name="multiply")

# Output node and associated summary
out = tf.add(c, d, name="output")
out_summary = tf.scalar_summary("output", out, name="output_summary")

# Start a session
sess = tf.Session()

# Define our "input" dictionary
feed_dict = { a: 4, b: 3 }

# Open a SummaryWriter
writer = tf.train.SummaryWriter(TB_DIR, graph=sess.graph)

# Compute the values of our output node and its summary
result, summary = sess.run([out, out_summary], feed_dict=feed_dict)

# Write summary to disk
writer.add_summary(summary, global_step=0)
writer.add_summary(summary, global_step=100)

# Close out of session and writer objects
sess.close()
writer.close()

TensorFlow Core API


Tensor Objects

What is a Tensor?

Tensors, simply put, are n-dimensional arrays. A 0-dimensional tensor is a single number (or scalar), a 1-dimensional tensor is a vector, and a 2-dimensional tensor is a standard matrix. Higher-dimensional tensors are simply referred to as "n-D tensors".

Every value that is passed through a TensorFlow model is a Tensor object, the TensorFlow representation of a tensor.

Defining tensors by hand

You can define Tensor object values in two main ways:

  1. Native Python types
  2. NumPy arrays (recommended)

Both of these can be converted automatically into TensorFlow Tensor objects.

Native Python


In [ ]:
# 0-D tensor (scalar)
t_0d_py = 4

# 1-D tensor (vector)
t_1d_py = [1, 2, 3]

# 2-D tensor (matrix)
t_2d_py = [[1, 2], 
           [3, 4], 
           [5, 6]]

# 3-D tensor
t_3d_py = [[[0, 0], [0, 1], [0, 2]],
           [[1, 0], [1, 1], [1, 2]],
           [[2, 0], [2, 1], [2, 2]]]

NumPy Arrays

Pretty much the same as native Python, but with the numpy.array function wrapping it:


In [ ]:
# 0-D tensor (scalar)
t_0d_np = np.array(4, dtype=np.int32)

# 1-D tensor (vector)
t_1d_np = np.array([1, 2, 3], dtype=np.int64)

# 2-D tensor (matrix)
t_2d_np = np.array([[1, 2], 
                    [3, 4], 
                    [5, 6]],
                   dtype=np.float32)

# 3-D tensor
t_3d_np = np.array([[[0, 0], [0, 1], [0, 2]],
                    [[1, 0], [1, 1], [1, 2]],
                    [[2, 0], [2, 1], [2, 2]]],
                   dtype=np.int32)

Data types

In general, using np.array (or np.asarray) is the recommended way of defining values for tensors by hand in TensorFlow. The primary reason for this is that you can specify the exact data type ("dtype") you'd like the values to be represented with. For example, there's no way to specify a 32-bit integer vs a 64-bit integer with native Python. TensorFlow is tightly integrated with NumPy, and most TensorFlow data types have a corresponding NumPy dtype:

TensorFlow type   Equivalent NumPy type   Description
tf.float32        np.float32              32-bit floating point.
tf.float64        np.float64              64-bit floating point.
tf.int8           np.int8                 8-bit signed integer.
tf.int16          np.int16                16-bit signed integer.
tf.int32          np.int32                32-bit signed integer.
tf.int64          np.int64                64-bit signed integer.
tf.uint8          np.uint8                8-bit unsigned integer.
tf.string         N/A                     String type, as byte array.
tf.bool           np.bool                 Boolean.
tf.complex64      np.complex64            Complex number made of two 32-bit floats: real and imaginary parts.
tf.qint8          N/A                     8-bit signed integer used in quantized Ops.
tf.qint32         N/A                     32-bit signed integer used in quantized Ops.
tf.quint8         N/A                     8-bit unsigned integer used in quantized Ops.

Slightly modified version of this table


In [19]:
# Just to show that they are equivalent
(tf.float32 == np.float32 and
 tf.float64 == np.float64 and
 tf.int8 == np.int8 and
 tf.int16 == np.int16 and
 tf.int32 == np.int32 and
 tf.int64 == np.int64 and
 tf.uint8 == np.uint8 and
 tf.bool == np.bool and
 tf.complex64 == np.complex64)


Out[19]:
True

The primary exception to using np.array() is when defining a Tensor of strings: for strings, just use a standard Python list. It's best practice to include the b prefix in front of the strings to explicitly define them as byte arrays:


In [ ]:
tf_string_tensor = [b"first", b"second", b"third"]

Tensor Shapes

A common term in TensorFlow is a Tensor object's "shape". A shape value is a list or tuple containing an ordered set of integers. The i-th element in the list describes the length of the i-th dimension in the tensor, while the number of elements in the list defines the dimensionality of the tensor. Here are some examples:


In [ ]:
# Shapes corresponding to scalars
# Note that either lists or tuples can be used
s_0d_list = []
s_0d_tuple = ()

# Shape corresponding to a vector of length 3
s_1d = [3]

# Shape corresponding to a 2-by-3 matrix
s_2d = (2, 3)

# Shape corresponding to a 4-by-4-by-4 cube tensor
s_3d = [4, 4, 4]

# Shape with a variable-length first dimension: `None` matches any length
s_var = [None, 4, 4]
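
None shapes are commonly paired with tf.placeholder to accept inputs whose first dimension (often the batch size) isn't known until runtime. A small sketch (the name here is illustrative):


In [ ]:
# Accepts any number of 4-by-4 matrices: the first dimension is unconstrained
var_input = tf.placeholder(tf.float32, shape=[None, 4, 4], name="var_input")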

You can use the tf.shape Operation to get the shape value of Tensor objects:


In [20]:
with tf.Session() as sess:
    get_shape = tf.shape([[[1, 2, 3], [1, 2, 3]],
                          [[2, 4, 6], [2, 4, 6]],
                          [[3, 6, 9], [3, 6, 9]],
                          [[4, 8, 12], [4, 8, 12]]])
    shape = sess.run(get_shape)
    print("Shape of tensor: " + str(shape))


Shape of tensor: [4 2 3]
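
tf.shape computes the shape at graph run time. Tensor objects also carry a statically known shape that you can read without a Session, via the get_shape() method. A quick sketch:


In [ ]:
# Read the static shape directly from the Tensor handle (no Session needed)
t = tf.constant([[1, 2, 3], [4, 5, 6]])
print(t.get_shape())  # expect (2, 3)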

Constants

You can create Tensor constants in your TensorFlow graph easily. Just use the tf.constant function:


In [21]:
my_const = tf.constant(np.array([1, 2, 3], dtype=np.float32))

If a set of values is going to be reused throughout your graph, using a constant is an easy way to place that value directly into the graph (instead of repeatedly feeding it in from a NumPy array or Python list).

Note: all Tensor objects are immutable. The constant type is simply a convenient way to add basic Tensor values to a graph.
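
For instance, here is a sketch of a single constant node feeding two downstream Operations:


In [ ]:
# Both Operations read from the same constant node in the graph
ones = tf.constant(np.array([1, 1], dtype=np.int32), name="ones")
doubled = tf.mul(ones, 2, name="doubled")
shifted = tf.add(ones, 5, name="shifted")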

A Note on SparseTensor

TensorFlow has implementations of sparse tensor representations, or tensors whose entries primarily consist of zeros. In some instances, SparseTensor and Tensor objects can be intermixed, but more often than not they require more care. Because the SparseTensor API isn't as robust as the Tensor API and for the sake of keeping things digestible, we won't cover SparseTensor objects today.

Operations

TensorFlow Operation objects (also referred to as "Ops" in the TensorFlow documentation; we will avoid that usage today so as not to mix up DevOps and TensorFlow Ops) are nodes that perform computation on or with Tensor objects. They take as input zero or more Tensor objects (or objects that can be converted into tensors; see the previous section) and output zero or more tensors. These outputs can then either be returned to the client or passed on to further Operations. Operations are the fundamental building blocks of any TensorFlow graph: their computations are the nodes, and data flowing from one to the next forms the edges.

We've already seen several Operations earlier: tf.add and tf.mul are classic examples: they both take in two tensors and output one. When given non-scalar values, they do addition/multiplication element-wise.


In [22]:
# Initialize some tensors
a = np.array([1, 2], dtype=np.int32)
b = np.array([3, 4], dtype=np.int32)

# `tf.add()` creates an "add" Operation and places it in the graph
# The variable `c` will be a handle to the output of the operation
# This output can be passed on to other Operations!
c = tf.add(a, b)

The important thing to remember is that Operations do not execute when created; that's the reason tf.add([1, 2], [3, 4]) doesn't return the value [4, 6] immediately. The Operation must be passed to Session.run() to execute, which we'll cover in more detail below.


In [23]:
sess = tf.Session()
c_result = sess.run(c)
print(c_result)


[4 6]

The majority of the TensorFlow API consists of Operations. tf.scalar_summary and tf.placeholder were both Operations we used in the first example; remember that we had to run the out_summary handle in Session.run()

In addition to Operation-specific inputs, each Operation can take in a name parameter, which can help identify Operations in TensorBoard and other tools.


In [ ]:
c = tf.add(a, b, name="my_add_operation")

Getting into the habit of adding names to your Operations now will save you headaches later on.

TensorFlow Graph Objects

When TensorFlow is imported into Python, it automatically creates a Graph object and makes it the default graph. You can create more graphs as well:


In [24]:
# Create a new graph - constructor takes no parameters
new_graph = tf.Graph()

However, operations (such as tf.add and tf.mul) are added to the default graph when created. To add operations to your new graph, use a with statement along with the graph's as_default() method. This makes that graph the default while inside of the with block:


In [25]:
# This constant is placed in the default graph

co = tf.constant(4)

with new_graph.as_default():
    a = tf.add(3, 4)
    b = tf.mul(a, 2)
    other_co = tf.constant(6)

In [31]:
sess = tf.Session(graph=tf.get_default_graph())
sess.run(co)


Out[31]:
4

The default graph, other than being set to the default, is no different than any other Graph. If you need to get a handle to the default graph, use the tf.get_default_graph function:


In [26]:
default_graph = tf.get_default_graph()

Note: get_default_graph() will return whatever graph is set to the default, so if you are inside of a with g.as_default() block, get_default_graph() will return g:


In [33]:
with new_graph.as_default():
    print(new_graph is tf.get_default_graph())

print(new_graph is tf.get_default_graph())


True
False

Most TensorFlow models will not require more than one graph per script. However, you may find multiple graphs useful when defining two independent models side-by-side. Additionally, there are mechanisms to export and import external models and load them in as Graph objects, which can allow you to feed the output of existing models into your new model (or vice versa). We won't explore these in depth today, but here's a rough sketch of the round trip; see Graph.as_graph_def() and tf.import_graph_def in the TensorFlow API for more information.
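

In [ ]:
# A rough sketch: serialize the default graph, then load it into a new Graph
graph_def = tf.get_default_graph().as_graph_def()

imported_graph = tf.Graph()
with imported_graph.as_default():
    # Imported nodes are prefixed with the given name ("imported/...")
    tf.import_graph_def(graph_def, name="imported")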

Sessions

Creating Sessions

As we saw earlier, Session objects are used to launch and execute graphs. Earlier, we created a session using its default constructor, but it has three optional parameters:

  • target specifies the execution engine to use. By default it is the empty string, which causes the Session to use the standard local execution context. Typically, this parameter is only used when using TensorFlow in a distributed setting
  • graph specifies which Graph object the session should run. The default value is None, which causes the Session to load in the default graph. Sessions only manage one graph at a time, so executing more than one graph will require more than one session
  • config allows users to specify advanced options to configure the session. We won't cover this in depth today (a brief sketch follows this list), but some of the available options are: limiting the number of CPUs/GPUs used, logging options, and changing the optimization of the graph
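
Here's that brief sketch of the config parameter. log_device_placement is one of the simpler options; it logs which device (CPU/GPU) each Operation is placed on:


In [ ]:
# Build a config that logs device placement for each Operation
config = tf.ConfigProto(log_device_placement=True)
sess_configured = tf.Session(config=config)
sess_configured.close()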

In [34]:
# A session with the default graph launched
# Equivalent to `tf.Session(graph=tf.get_default_graph())`
sess_default = tf.Session()

# A session with new_graph launched
sess_new = tf.Session(graph=new_graph)

Running Sessions

The most important method of a Session is its run() function. Earlier in this notebook, we saw basic usage of the two primary parameters to run(): fetches and feed_dict.

fetches

fetches expects a list of Tensor and/or Operation handles (or just a single Tensor/Operation). The list specifies what computations we would like TensorFlow to run, as well as what we'd like run() to output:


In [35]:
sess_default.run(tf.add(3, 2))


Out[35]:
5

TensorFlow will only perform calculations necessary to compute the values specified in fetches, so it won't waste time if you only need to run a small part of a large, complicated graph.
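
Because fetches accepts a list, you can also compute several handles in one call; the returned list lines up element-for-element with the fetches:


In [ ]:
# Fetch two values at once; expect [5, 20]
x = tf.add(3, 2)
y = tf.mul(x, 4)
print(sess_default.run([x, y]))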

feed_dict

feed_dict is an optional parameter to run, but becomes required when placeholder nodes are included. We saw it used to feed input data to placeholders, but feed_dict can actually send values to any node. The keys to the dictionary should be handles to Tensor objects (usually outputs of Operations), and the values should be replacement data:


In [ ]:
# Create Operations, Tensors, etc (using the default graph)
a = tf.add(3, 4)
b = tf.mul(a, 5)

# Define a dictionary that says to replace the value of `a` with 15
replace_dict = {a: 15}

In [ ]:
# Run the session without feed_dict
# Prints (3 + 4) * 5 = 35
print(sess_default.run(b))

In [ ]:
# Run the session, passing in `replace_dict` as the value to `feed_dict`
# Prints 15 * 5 = 75 instead of 7 * 5 = 35
print(sess_default.run(b, feed_dict=replace_dict))

When using placeholders, TensorFlow insists that any call to Session.run() include feed_dict values for all placeholders:


In [36]:
a = tf.placeholder(tf.int32, name="my_placeholder")
b = tf.add(a, 3)

In [38]:
# This raises an error:
try:
    sess_default.run(b)
except tf.errors.InvalidArgumentError as e:
    print(e.message)


You must feed a value for placeholder tensor 'my_placeholder' with dtype int32
	 [[Node: my_placeholder = Placeholder[dtype=DT_INT32, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

In [39]:
# Create feed dictionary
feed_dict = {a: 8}

# Now it works!
print(sess_default.run(b, feed_dict=feed_dict))


11

In [40]:
# Closing out the Sessions we opened up
sess_default.close()
sess_new.close()

TensorFlow Variables

The last fundamental TensorFlow class is the Variable. A TensorFlow Variable has persistent state across multiple calls to Session.run(), which means that the learned parameters in machine learning models are Variables. We can create a Variable with a starting value of 0 like so:


In [43]:
my_var = tf.Variable(0, name="my_var")

However, even though the object has been created, the value of the Variable has to be initialized separately, with either the tf.initialize_variables() or, more commonly, the tf.initialize_all_variables() Operation. Remember that Operations must be passed to Session.run() to be executed:


In [45]:
sess = tf.Session()
sess.run(tf.initialize_all_variables())

In [46]:
sess.run(my_var)


Out[46]:
0
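
If you only need to initialize a subset of your Variables, there is also tf.initialize_variables(), which takes an explicit list; a minimal sketch:


In [ ]:
# Initialize only the Variables you name
sess.run(tf.initialize_variables([my_var]))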

Having value initialization separated from object creation allows us to reinitialize the variable later if we'd like.

Now that the Variable is initialized, we can tweak its value! Let's do some basic incrementing with the Variable.assign() method:


In [49]:
increment = my_var.assign(my_var + 1)

for i in range(10):
    print(sess.run(increment))


21
22
23
24
25
26
27
28
29
30

You may notice that if you run the previous code multiple times in the notebook, the value persists and continues to climb. The Variable's state is maintained by the Session object, and that state will persist unless the session is closed, the Variable is re-initialized, or a new value is assigned to the Variable.


In [50]:
# Re-initialize variables
sess.run(tf.initialize_all_variables())

# Start incrementing, beginning from 0 again
for i in range(10):
    print(sess.run(increment))


1
2
3
4
5
6
7
8
9
10
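
Assigning a specific value directly is the third way to change the state; a quick sketch:


In [ ]:
# Set the Variable straight to a new value, skipping the increments
sess.run(my_var.assign(42))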

Trainable

There are several optional parameters in the Variable constructor, but one to pay close attention to is trainable. It takes a boolean value, which defaults to True, and specifies to TensorFlow whether the built-in optimization functions (which we will cover in a separate notebook) should affect this Variable. If a Variable in your model should not be adjusted during gradient descent, set its trainable parameter to False.
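
A common example is a step counter that your own bookkeeping code updates, which the optimizer should never touch; a sketch:


In [ ]:
# A Variable that built-in optimizers will leave alone
global_step = tf.Variable(0, trainable=False, name="global_step")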

tf.get_variable()

Though the basic Variable() constructor is intuitive and good for beginners, eventually we would encourage you to move on to using the tf.get_variable() method for creating and accessing Variable objects. It allows users to more easily share Variables across complicated models, where handles to exact Variables can be lost or hard to manage. We will show some examples with tf.get_variable() in another notebook, but do check out the official how-to, as tf.get_variable() is best practice.
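
As a quick preview, here is a minimal sketch of tf.get_variable() together with tf.variable_scope() (the names here are purely illustrative):


In [ ]:
# Creates (or, with reuse enabled, retrieves) a Variable named "my_scope/w"
with tf.variable_scope("my_scope"):
    w = tf.get_variable("w", shape=[2, 2],
                        initializer=tf.constant_initializer(0.0))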


In [ ]:
sess.close()
